Naive Bayes Word Sense Induction
Authors
Abstract
We introduce an extended naive Bayes model for word sense induction (WSI) and apply it to a WSI task. The extended model incorporates the idea that words closer to the target word are more relevant in predicting its sense. The proposed model is very simple yet effective when evaluated on SemEval-2010 WSI data.
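The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a hypothetical geometric decay weight on each context word's log-likelihood, so that words nearer the target contribute more to the sense score. The names `distance_weight`, `sense_scores`, `predict_sense`, the `decay` parameter, and the toy probability tables are all invented for illustration. In an actual induction setting the parameters would have to be learned from unlabeled data, which is not shown here.

```python
import math


def distance_weight(distance, decay=0.5):
    """Hypothetical weight: 1.0 for an adjacent word, shrinking geometrically."""
    return decay ** (distance - 1)


def sense_scores(context, target_index, priors, word_probs, decay=0.5, floor=1e-6):
    """Return {sense: score}, where the score is the log prior plus
    distance-weighted log-likelihoods of the context words."""
    scores = {}
    for sense, prior in priors.items():
        score = math.log(prior)
        for i, word in enumerate(context):
            if i == target_index:
                continue  # the target word itself is not a context feature
            # Unseen words get a small floor probability (an assumption).
            p = word_probs[sense].get(word, floor)
            weight = distance_weight(abs(i - target_index), decay)
            score += weight * math.log(p)
        scores[sense] = score
    return scores


def predict_sense(context, target_index, priors, word_probs, decay=0.5):
    """Pick the sense with the highest weighted score."""
    scores = sense_scores(context, target_index, priors, word_probs, decay)
    return max(scores, key=scores.get)


# Toy parameters, made up for this example (not taken from the paper).
PRIORS = {"finance": 0.5, "river": 0.5}
WORD_PROBS = {
    "finance": {"money": 0.3, "water": 0.01},
    "river": {"money": 0.01, "water": 0.3},
}

if __name__ == "__main__":
    # "money" is adjacent to the target "bank"; "water" is three words away.
    sentence = ["water", "the", "old", "bank", "money"]
    print(predict_sense(sentence, 3, PRIORS, WORD_PROBS))
```

With equal weights (`decay=1.0`) this reduces to an ordinary naive Bayes score, in which the two cue words in the example sentence would cancel out. The decay is what lets the nearer word decide the sense.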
Similar resources
Applying a Naive Bayes Similarity Measure to Word Sense Disambiguation
We replace the overlap mechanism of the Lesk algorithm with a simple, general-purpose Naive Bayes model that measures many-to-many association between two sets of random variables. Even with simple probability estimates such as maximum likelihood, the model gains significant improvement over the Lesk algorithm on word sense disambiguation tasks. With additional lexical knowledge from WordNet, pe...
Comparative study of statistical word sense discrimination techniques
Word sense discrimination aims at automatically determining which instances of an ambiguous word share the same sense. A fully unsupervised technique based on a vector representation of word senses was proposed by Schütze (Schütze, 1998). While the original model was assumed to be Gaussian, practical results were only reported for an approximated model making hard decisions between sense cluste...
Raw Corpus Word Sense Disambiguation
A wide range of approaches have been applied to word sense disambiguation. However, most require manually crafted knowledge such as annotated text, machine-readable dictionaries or thesauri, semantic networks, or aligned bilingual corpora. The reliance on these knowledge sources limits portability since they generally exist only for selected domains and languages. This poster presents a corpus-b...
Naïve Bayes Classifier for Arabic Word Sense Disambiguation
Word Sense Disambiguation (WSD) is the process of selecting a sense of an ambiguous word in a given context from a set of predefined senses. The sense inventory usually comes from a dictionary or thesaurus. In Arabic, the main cause of word ambiguity is the lack of diacritics in most digital documents, so the same word can occur with different senses. In this paper, we use the rooting algorithm...
Journal title:
Volume, issue:
Pages: -
Publication date: 2013